24 research outputs found

    Bundle methods in nonsmooth DC optimization

    Due to the complexity of many practical applications, we encounter optimization problems with nonsmooth functions, that is, functions which are not continuously differentiable everywhere. Classical gradient-based methods are not applicable to such problems, since they may fail in the nonsmooth setting. Therefore, it is imperative to develop numerical methods specifically designed for nonsmooth optimization. To date, bundle methods are considered to be the most efficient and reliable general-purpose solvers for this type of problem. The idea in bundle methods is to approximate the subdifferential of the objective function by a bundle of subgradients. This information is then used to build a model of the objective. However, this model is typically convex and, as a result, it may be inaccurate and unable to adequately reflect the behaviour of the objective function in the nonconvex case. These circumstances motivate the design of new bundle methods based on nonconvex models of the objective function.
In this dissertation, the main focus is on nonsmooth DC optimization, which constitutes an important and broad subclass of nonconvex optimization problems. A DC function can be presented as a difference of two convex functions. Thus, we can obtain a model that explicitly utilizes both the convexity and the concavity of the objective by approximating the convex and concave parts separately. This way we end up with a nonconvex DC model that describes the problem more accurately than the convex one. Based on the new DC model, we introduce three different bundle methods. Two of them are designed for unconstrained DC optimization, and the third one is also capable of solving multiobjective and constrained DC problems. Finite convergence is proved for each method. The numerical results demonstrate the efficiency of the methods and show the benefits obtained from utilizing the DC decomposition.
Even though the use of the DC decomposition can improve the performance of bundle methods, it is not always available or possible to construct. Thus, we present another bundle method for a general objective function, which implicitly collects information about the DC structure. This method is developed for large-scale nonsmooth optimization, and its convergence is proved for semismooth functions. The efficiency of the method is shown with numerical results. As an application of the developed methods, we consider clusterwise linear regression (CLR) problems. By applying the support vector machines (SVM) approach, a new model for these problems is proposed. The objective in the new formulation of the CLR problem is expressed as a DC function, and a method based on one of the presented bundle methods is designed to solve it. Numerical results demonstrate the robustness of the new approach to outliers.
In many practical applications, the problem under consideration is complex and must therefore be modelled with nonsmooth functions, which are not necessarily continuously differentiable everywhere. Classical gradient-based optimization methods cannot be used for nonsmooth problems, since nonsmooth functions do not have a classical gradient everywhere. It is therefore necessary to develop dedicated numerical solution methods for nonsmooth optimization. Of these, bundle methods are currently regarded as the most efficient and reliable general-purpose methods for solving such problems. The idea in bundle methods is to approximate the subdifferential of the objective function with a bundle formed by collecting subgradients of the objective function from previous iterations. Using this information, a model of the objective function can be built that is easier to solve than the original problem.
The model used is typically convex, and it may therefore be inaccurate and unable to represent the structure of the original problem in the nonconvex case. For this reason, the dissertation concentrates on developing new bundle methods that construct a nonconvex model of the objective function in the modelling phase. The main focus of the dissertation is on nonsmooth optimization problems in which the functions can be expressed as a difference of two convex functions. Such functions are called DC functions, and they form an important and broad subclass of nonconvex functions. This choice makes it possible to exploit the convexity and concavity of the objective function explicitly, since the new model of the objective function is formed by combining separate models built for the convex and concave parts. In this way we arrive at a nonconvex DC model that can describe the problem being solved more accurately than a convex approximation. The dissertation presents three different bundle methods developed on the basis of the new DC model and proves their convergence. Two of these methods are designed for unconstrained DC optimization, and the third can also solve multiobjective and constrained DC optimization problems. Numerical results illustrate the efficiency of the methods and the benefits gained from using the DC decomposition. Although using the DC decomposition can improve the performance of bundle methods, it is not always available or possible to construct. For this reason, the dissertation also presents a fourth bundle method, together with its convergence proof, for a general objective function; this method implicitly collects information about the DC structure of the objective function. The method is developed especially for large-scale nonsmooth optimization problems, and its efficiency is demonstrated by numerical testing. As an application, the dissertation considers clusterwise linear regression of data.
A new model is formulated for this application using the support vector machines (SVM) approach from machine learning, and the resulting objective function is expressed as a DC function. Consequently, one of the developed bundle methods is applied to solve the problem. Numerical results illustrate the robustness and efficiency of the new approach.
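
The modelling idea described in this abstract, approximating the convex and concave parts separately, can be illustrated with a small sketch. The test function f(x) = x^2 - |x|, the bundle points and all names below are assumptions chosen for illustration; they do not come from the dissertation, and the actual methods add proximity terms and bundle management that are not shown here.

```python
import numpy as np

# Illustrative DC function (an assumption, not taken from the dissertation):
# f(x) = f1(x) - f2(x) with f1(x) = x^2 and f2(x) = |x|, both convex.
def f1(x):
    return x * x

def g1(x):
    return 2.0 * x                      # gradient of f1

def f2(x):
    return abs(x)

def g2(x):
    return 1.0 if x >= 0 else -1.0      # one subgradient of f2

def cutting_plane(points, func, subgrad):
    """Piecewise linear model max_j {func(y_j) + s_j * (x - y_j)} built from a bundle."""
    pts = np.array(points, dtype=float)
    vals = np.array([func(y) for y in points])
    slopes = np.array([subgrad(y) for y in points])
    return lambda x: float(np.max(vals + slopes * (x - pts)))

bundle = [-1.5, -0.5, 0.5, 2.0]         # hypothetical bundle points
m1 = cutting_plane(bundle, f1, g1)      # convex model of f1
m2 = cutting_plane(bundle, f2, g2)      # convex model of f2

def dc_model(x):
    """Nonconvex DC model: difference of the two convex cutting-plane models."""
    return m1(x) - m2(x)
```

Each cutting-plane model underestimates its convex component and is exact at the bundle points, so the DC model agrees with f there while still being nonconvex in between.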

    Bundle-based descent method for nonsmooth multiobjective DC optimization with inequality constraints

    Multiobjective DC optimization problems arise naturally, for example, in data classification and cluster analysis, which play a crucial role in data mining. In this paper, we propose a new multiobjective double bundle method designed for nonsmooth multiobjective optimization problems whose objective and constraint functions can be presented as a difference of two convex (DC) functions. The method is of the descent type, and it generalizes the ideas of the double bundle method to multiobjective and constrained problems. We utilize a special cutting plane model tailored to the DC improvement function such that both the convex and the concave behaviour of the function are captured. The method is proved to be finitely convergent to a weakly Pareto stationary point under mild assumptions. Finally, we consider some numerical experiments and compare the solutions produced by our method with those of a method designed for general nonconvex multiobjective problems. This is done in order to validate the use of a method aimed specifically at DC objectives instead of a general nonconvex method.

    Aggregate subgradient method for nonsmooth DC optimization

    The aggregate subgradient method is developed for solving unconstrained nonsmooth difference of convex (DC) optimization problems. The proposed method shares some similarities with both the subgradient and the bundle methods. Aggregate subgradients are defined as a convex combination of subgradients computed at null steps between two serious steps. At each iteration, search directions are found using only two subgradients: the aggregate subgradient and a subgradient computed at the current null step. It is proved that the proposed method converges to a critical point of the DC optimization problem and also that the number of null steps between two serious steps is finite. The new method is tested using some academic test problems and compared with several other nonsmooth DC optimization solvers. © 2020, Springer-Verlag GmbH Germany, part of Springer Nature
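
The abstract states that only two subgradients are combined at each iteration. One natural way to do this, assumed here purely for illustration and not confirmed by the abstract as the paper's exact rule, is to take the convex combination of the two vectors with the smallest Euclidean norm. The function name `min_norm_combination` is hypothetical.

```python
import numpy as np

def min_norm_combination(agg, new):
    """Return (lam, v), where v = lam * agg + (1 - lam) * new has minimal
    Euclidean norm over lam in [0, 1]."""
    agg = np.asarray(agg, dtype=float)
    new = np.asarray(new, dtype=float)
    diff = agg - new
    denom = float(diff @ diff)
    if denom == 0.0:
        # Both vectors coincide, so every convex combination is the same.
        return 1.0, agg.copy()
    # Unconstrained minimizer of ||new + lam * diff||^2, clipped to [0, 1].
    lam = float(np.clip(-(new @ diff) / denom, 0.0, 1.0))
    return lam, lam * agg + (1.0 - lam) * new

# Usage: the negative of v could serve as a search direction candidate.
lam, v = min_norm_combination([1.0, 0.0], [0.0, 1.0])
```

Because only two vectors are stored, the subproblem has a closed-form solution, which is the kind of saving over a full bundle that the abstract alludes to.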

    Splitting Metrics Diagonal Bundle Method for Large-Scale Nonconvex and Nonsmooth Optimization

    Nonsmooth optimization is traditionally based on convex analysis, and most solution methods rely strongly on the convexity of the problem. In this paper, we propose an efficient diagonal bundle method for nonconvex large-scale nonsmooth optimization. The novelty of the new method lies in its use of different metrics depending on the convex or concave behaviour of the objective at the current iteration point. Using different metrics makes it possible to deal with the nonconvexity of the problem better than relying solely on downward shifting of the piecewise linear model, which is the most commonly used and quite arbitrary remedy. The convergence of the proposed method is proved for semismooth functions that are not necessarily differentiable or convex. Numerical experiments have been carried out on problems with up to one million variables. The results confirm the usability of the new method.

    Double bundle method for finding Clarke stationary points in nonsmooth DC programming

    The aim of this paper is to introduce a new proximal double bundle method for unconstrained nonsmooth optimization, where the objective function is presented as a difference of two convex (DC) functions. The novelty in our method is a new escape procedure which enables us to guarantee approximate Clarke stationarity for solutions by utilizing the DC components of the objective function. This optimality condition is stronger than the criticality condition typically used in DC programming. Moreover, if a candidate solution is not approximate Clarke stationary, then the escape procedure returns a descent direction. With this escape procedure, we can avoid some shortcomings encountered when criticality is used. The finite termination of the double bundle method to an approximate Clarke stationary point is proved by assuming that the subdifferentials of DC components are polytopes. Finally, some encouraging numerical results are presented.
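
A small worked example may help show why Clarke stationarity is the stronger condition. The one-dimensional function below is a standard illustration chosen here as an assumption; it is not quoted from the abstract. A point x is critical if the subdifferentials of the two DC components intersect, and Clarke stationary if zero belongs to the Clarke subdifferential of f itself.

```latex
% DC components and the resulting objective
f_1(x) = \max\{-x,\ 2x\}, \qquad f_2(x) = \max\{-2x,\ x\}, \qquad
f(x) = f_1(x) - f_2(x) = x \quad \text{for all } x \in \mathbb{R}.

% Subdifferentials of the components at x = 0
\partial f_1(0) = [-1,\ 2], \qquad \partial f_2(0) = [-2,\ 1].

% Criticality holds at x = 0 ...
\partial f_1(0) \cap \partial f_2(0) = [-1,\ 1] \neq \emptyset.

% ... but Clarke stationarity does not, since f is linear with slope 1
\partial f(0) = \{1\}, \qquad 0 \notin \partial f(0).
```

Here x = 0 is critical for this particular decomposition even though f(x) = x has no stationary point at all, and d = -1 is a descent direction at x = 0.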

    OSCAR: Optimal subset cardinality regression using the L0-pseudonorm with applications to prognostic modelling of prostate cancer

    Author summary: Feature subset selection has become a crucial part of building biomedical models, due to the abundance of available predictors in many applications, yet there remains uncertainty about their importance and generalization ability. Regularized regression methods have become popular approaches to tackle this challenge by balancing the model goodness-of-fit against the increasing complexity of the model in terms of coefficients that deviate from zero. Regularization norms are pivotal in formulating the model complexity, and currently the L1-norm (LASSO), the L2-norm (Ridge Regression) and their hybrid (Elastic Net) dominate the field. In this paper, we present a novel methodology that is based on the L0-pseudonorm, also known as best subset selection, which has been largely overlooked due to its challenging discrete nature. Our methodology makes use of a continuous transformation of the discrete optimization problem and provides effective solvers implemented in a user-friendly R software package. We exemplify the use of the oscar package in the context of prostate cancer prognostic prediction using both real-world hospital registry and clinical cohort data. By benchmarking the methodology against existing regularization methods, we illustrate the advantages of the L0-pseudonorm for better clinical applicability and selection of grouped features, and demonstrate its applicability in high-dimensional transcriptomics datasets.
In many real-world applications, such as those based on electronic health records, prognostic prediction of patient survival is based on heterogeneous sets of clinical laboratory measurements. To address the trade-off between the predictive accuracy of a prognostic model and the costs related to its clinical implementation, we propose an optimized L0-pseudonorm approach to learn sparse solutions in multivariable regression. The model sparsity is maintained by restricting the number of nonzero coefficients in the model with a cardinality constraint, which makes the optimization problem NP-hard. In addition, we generalize the cardinality constraint for grouped feature selection, which makes it possible to identify key sets of predictors that may be measured together in a kit in clinical practice. We demonstrate the operation of our cardinality constraint-based feature subset selection method, named OSCAR, in the context of prognostic prediction of prostate cancer patients, where it enables one to determine the key explanatory predictors at different levels of model sparsity. We further explore how the model sparsity affects the model accuracy and implementation cost. Lastly, we demonstrate generalization of the presented methodology to high-dimensional transcriptomics data.
Peer reviewed
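
The abstract mentions a continuous transformation of the cardinality constraint without spelling it out. One well-known continuous reformulation, shown here only as an assumed illustration and not necessarily the one used in OSCAR, writes the constraint "at most K nonzero coefficients" as the L1-norm minus the largest-K norm being zero. Both terms are convex, so the gap is a DC function. The Python function names are hypothetical and unrelated to the R package.

```python
import numpy as np

def largest_k_norm(x, k):
    """Sum of the k largest absolute values of x (a convex function of x)."""
    a = np.sort(np.abs(np.asarray(x, dtype=float)))[::-1]
    return float(a[:k].sum())

def cardinality_gap(x, k):
    """L1-norm minus largest-k norm: nonnegative, and zero exactly when
    x has at most k nonzero entries."""
    l1 = float(np.abs(np.asarray(x, dtype=float)).sum())
    return l1 - largest_k_norm(x, k)

# Usage: a coefficient vector with two nonzeros satisfies K = 2,
# while a vector with three nonzeros violates it.
feasible = cardinality_gap([3.0, 0.0, -2.0, 0.0], 2)
infeasible = cardinality_gap([3.0, 1.0, -2.0, 0.0], 2)
```

A gap of this kind can be added to the regression loss as a penalty, which replaces the discrete constraint with a continuous nonsmooth term.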

    New bundle method for clusterwise linear regression utilizing support vector machines

    Clusterwise linear regression (CLR) aims to simultaneously partition a data set into a given number of clusters and find regression coefficients for each cluster. In this paper, we propose a novel approach to solve the CLR problem. The main idea is to utilize the support vector machine (SVM) approach to model the CLR problem by using the SVM for regression to approximate each cluster. This new formulation of CLR is represented as an unconstrained nonsmooth optimization problem, where the objective function is a difference of convex (DC) functions. A method based on the combination of the incremental algorithm and the double bundle method for DC optimization is designed to solve it. Numerical experiments are made to validate the reliability of the new formulation and the efficiency of the proposed method. The results show that the SVM approach is beneficial in solving CLR problems, especially when there are outliers in the data.
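
To make the DC structure of a clusterwise objective concrete, here is a simplified sketch. It assumes an epsilon-insensitive loss of the kind used in SVM regression and omits any regularization terms; the abstract does not give the paper's exact formulation, so the function names, the default `eps` and the data layout are all assumptions.

```python
import numpy as np

def eps_insensitive(residual, eps):
    """SVM-regression style loss: max(0, |r| - eps)."""
    return np.maximum(0.0, np.abs(residual) - eps)

def clr_errors(X, y, W, b, eps):
    """Error of every point under every cluster's linear model, shape (n, k)."""
    residual = X @ W.T + b - y[:, None]
    return eps_insensitive(residual, eps)

def clr_objective(X, y, W, b, eps=0.1):
    """Each point is charged the error of its best-fitting cluster."""
    return float(clr_errors(X, y, W, b, eps).min(axis=1).sum())

def clr_dc_parts(X, y, W, b, eps=0.1):
    """DC split using min_j e_j = sum_j e_j - max_j sum_{t != j} e_t."""
    E = clr_errors(X, y, W, b, eps)
    total = E.sum(axis=1)
    f1 = float(total.sum())                             # convex in (W, b)
    f2 = float((total[:, None] - E).max(axis=1).sum())  # convex in (W, b)
    return f1, f2

# Usage: two lines, y = x and y = -x, fitted by two clusters.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 1.0, -2.0, -3.0])
W = np.array([[1.0], [-1.0]])
b = np.array([0.0, 0.0])
value = clr_objective(X, y, W, b)
f1, f2 = clr_dc_parts(X, y, W, b)
```

The minimum over clusters is what makes the objective nonconvex; rewriting it as a sum minus a maximum exposes the two convex components that a DC bundle method can model separately.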

    A New Subgradient Based Method for Nonsmooth DC Programming

    The aggregate subgradient method is developed for solving unconstrained nonsmooth difference of convex (DC) optimization problems. The proposed method shares some similarities with both the subgradient and the bundle methods. Aggregate subgradients are defined as a convex combination of subgradients computed at null steps between two serious steps. At each iteration, search directions are found using only two subgradients: the aggregate subgradient and a subgradient computed at the current null step. It is proved that the proposed method converges to a critical point of the DC optimization problem and also that the number of null steps between two serious steps is finite. The new method is tested using some academic test problems and compared with several other nonsmooth DC optimization solvers.

    Double Bundle Method for Nonsmooth DC Optimization

    The aim of this paper is to introduce a new proximal double bundle method for unconstrained nonsmooth DC optimization, where the objective function is presented as a difference of two convex (DC) functions. The novelty in our method is a new stopping procedure guaranteeing Clarke stationarity for solutions by utilizing only the DC components of the objective function. This optimality condition is stronger than the criticality condition typically used in DC programming. Moreover, if a candidate solution is not Clarke stationary, then the stopping procedure yields a descent direction. With this new stopping procedure, we can avoid some drawbacks that are encountered when criticality is used. Finite convergence of the method to a Clarke stationary point is proved under mild assumptions. Finally, some encouraging numerical results are presented.